AFRL-SR-AR-TR-10-0338


[ACTIVE-VISION  CONTROL  SYSTEMS  FOR  COMPLEX 
ADVERSARIAL  3-D  ENVIRONMENTS  ] 


Eric  N.  Johnson,  Principal  Investigator 

Anthony Calise, Allen Tannenbaum, Anthony Yezzi,

Stefano  Soatto,  George  Barbastathis,  and  Naira  Hovakimyan 

Georgia  Institute  of  Technology 

MARCH  2009 
Final  Report 


DISTRIBUTION  A:  Distribution  approved  for  public  release. 


AIR  FORCE  RESEARCH  LABORATORY 
AF  OFFICE  OF  SCIENTIFIC  RESEARCH  (AFOSR)/RSL 
ARLINGTON,  VIRGINIA  22203 
AIR  FORCE  MATERIEL  COMMAND 


20101202163 



Raytheon  Company  Limited  Data  Rights 
Data  subject  to  restrictions  on  cover  and  Notice  page 


REPORT  DOCUMENTATION  PAGE 


AFRL-SR-AR-TR-10-0338


The public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and
maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including
suggestions for reducing the burden, to the Department of Defense, Executive Service Directorate (0704-0188). Respondents should be aware that notwithstanding any other provision of law, no
person shall be subject to any penalty for failing to comply with a collection of information if it does not display a currently valid OMB control number.

PLEASE  DO  NOT  RETURN  YOUR  FORM  TO  THE  ABOVE  ORGANIZATION. 


1. REPORT DATE (DD-MM-YYYY)

31-03-2009 


2.  REPORT  TYPE 


Final  Report 


4.  TITLE  AND  SUBTITLE 

ACTIVE-VISION  CONTROL  SYSTEMS  FOR 

COMPLEX  ADVERSARIAL  3-D  ENVIRONMENTS 


6.  AUTHOR(S) 

Eric  N.  Johnson,  Principal  Investigator 

Anthony Calise, Allen Tannenbaum, Anthony Yezzi, Stefano Soatto, George
Barbastathis,  and  Naira  Hovakimyan 


3. DATES COVERED (From - To)

01-07-2003 to 31-12-2008

5a.  CONTRACT  NUMBER 

F49620-03-1-0401


5b.  GRANT  NUMBER 


5c.  PROGRAM  ELEMENT  NUMBER 


5d.  PROJECT  NUMBER 

E-16-V91 


5e.  TASK  NUMBER 


5f.  WORK  UNIT  NUMBER 


7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 
Eric  Johnson 

Georgia  Institute  of  Technology 
270 Ferst Drive
Atlanta  GA  30332-0150 


9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 

Sponsor  technical  contact:  FARIBA  FAHROO 
AFOSR/NM 

875  NORTH  RANDOLPH  STREET 
SUITE 325, ROOM 3112


8.  PERFORMING  ORGANIZATION 
REPORT  NUMBER 


10.  SPONSOR/MONITOR’S  ACRONYM(S) 

AFOSR/NM 


11.  SPONSOR/MONITOR’S  REPORT 

NUMBER(S) 


12.  DISTRIBUTION/AVAILABILITY  STATEMENT 
Unlimited 


13.  SUPPLEMENTARY  NOTES 


14.  ABSTRACT 

This  project  has  included  development  of  methods  that  utilize  2-D  and  3-D  imagery  (e.g.,  from  visual,  FLIR,  LADAR,  acoustic)  to  enable  aerial 
vehicles  to  autonomously  detect  and  prosecute  targets  in  uncertain  complex  3-D  adversarial  environments,  including  capabilities  and  approaches 
inspired  by  those  found  in  nature,  and  without  relying  upon  highly  accurate  3-D  models  of  the  environment.  The  new  capabilities  of  autonomous 
sensing  and  control  enable  UAV/munition  operations:  in  a  clandestine/covert  manner;  in  close  proximity  to  hazards,  structures,  and/or  terrain;  and 
in  uncertain/adversarial  3-D  environments.  Furthermore,  the  team  performed  a  productive  flying  testbed  activity  as  part  of  the  program.  This 
ensures  that  the  methods  are  sound  in  the  sense  that  they  are:  (1)  implementable  in  real-time,  (2)  capable  of  practical  use  in  the  field,  and  (3)  based 
on realistic/achievable sensor capabilities. This project is a Multidisciplinary University Research Initiative (MURI).


15.  SUBJECT  TERMS 

Vision,  Control,  Tracking 


16. SECURITY CLASSIFICATION OF:

a. REPORT: U
b. ABSTRACT: U
c. THIS PAGE: U

17. LIMITATION OF ABSTRACT

none

18. NUMBER OF PAGES

40

19a. NAME OF RESPONSIBLE PERSON

Eric Johnson

19b.  TELEPHONE  NUMBER  (Include  area  code) 
404-385-2519 

Standard  Form  298  (Rev.  8/98) 

Prescribed by ANSI Std. Z39.18
Adobe Professional 7.0


ACTIVE-VISION  CONTROL  SYSTEMS  FOR 
COMPLEX  ADVERSARIAL  3-D  ENVIRONMENTS 


Final  Report 

AFOSR F49620-03-1-0401
For the period July 1, 2003 to December 31, 2008
March 31, 2009

Eric  N.  Johnson,  Principal  Investigator 
School of Aerospace Engineering
Georgia  Institute  of  Technology 
Atlanta,  GA  30332-0150 
(Tel)  404-385-2519 
(Fax)  404-894-2760 
eric.johnson@ae.gatech.edu

Anthony  J.  Calise 
School  of  Aerospace  Engineering 
Georgia  Institute  of  Technology 

Allen  R.  Tannenbaum,  Anthony  J.  Yezzi,  Jr. 
School  of  Electrical  and  Computer  Engineering 
Georgia  Institute  of  Technology 

Stefano  Soatto 

Department  of  Computer  Science 
University  of  California  at  Los  Angeles 

George  Barbastathis 
Department  of  Mechanical  Engineering 
Massachusetts  Institute  of  Technology 

Naira  Hovakimyan 

Department  of  Mechanical  Science  and  Engineering 
University  of  Illinois  at  Urbana-Champaign 


Active-Vision  Control  Systems  MURI  Final  Report 




Abstract 


This  project  has  included  development  of  methods  that  utilize  2-D  and  3-D  imagery  (e.g., 
from  visual,  FLIR,  LADAR,  acoustic)  to  enable  aerial  vehicles  to  autonomously  detect 
and  prosecute  targets  in  uncertain  complex  3-D  adversarial  environments,  including 
capabilities  and  approaches  inspired  by  those  found  in  nature,  and  without  relying  upon 
highly  accurate  3-D  models  of  the  environment.  The  new  capabilities  of  autonomous 
sensing  and  control  enable  UAV/munition  operations:  in  a  clandestine/covert  manner;  in 
close  proximity  to  hazards,  structures,  and/or  terrain;  and  in  uncertain/adversarial  3-D 
environments.  This  project  is  a  Multidisciplinary  University  Research  Initiative  (MURI). 
The  critical  technical  innovations  we  are  bringing  to  bear  on  the  problem  include: 

1. Knowledge-based segmentation;

2.  Adaptation  and  estimation  in  geometric  active  contours; 

3.  Adaptive  control  frameworks  for  active  vision  systems; 

4.  Multigrid  and  polygonal  methods  for  optical  flow; 

5.  Imaging  sensors  designed  to  produce  sensor  information  for  control. 

Furthermore,  the  team  performed  a  productive  flying  testbed  activity  as  part  of  the 
program.  This  ensures  that  the  methods  are  sound  in  the  sense  that  they  are:  (1) 
implementable  in  real-time,  (2)  capable  of  practical  use  in  the  field,  and  (3)  based  on 
realistic/achievable  sensor  capabilities. 

Final  Status  of  Effort 

Our  team  has  completed  the  five  years  of  this  MURI  along  with  a  small  part  of  the  effort 
that  continued  on  a  contract  extension  for  an  additional  six  months.  During  that  time,  our 
first  major  focus  was  the  generalized  visual  tracking  problem,  that  is,  the  tracking  of 
objects/features  based  on  real-time  imagery.  Successful  tracking  allows  for  the 
utilization  of  visual  information  of  an  airborne  target  in  the  feedback  loop  for  the 
purposes of pursuit, evasion, or formation flight. Applying the same approach to a ground target,
our second major focus, enables automated pursuit/surveillance. When 2-D vision sensors are used,
estimating target range is challenging, and a number of approaches have been studied,
including utilization of target size/shape in the image, optimal guidance policies, and use
of adaptation. In the third year, vision-based formation flight between two aircraft was
successfully accomplished. Subsequently, experimental work moved to more complex
formation flight scenarios with the objective of pursuit utilizing vision
sensors only. Several results involving visual tracking of a ground target
have also been produced. Progress was also made on a third
(related) focus, the fixed object detection problem. The issues are similar (e.g., the
difficulty of estimating range), as are the methods we explored to tackle the problem.
There  have  been  extensive  interactions  with  AFRL/MNGN  in  support  of  related  activities 
there.  This  report  covers  the  entire  project.  The  section  that  immediately  follows  covers 
background  material.  This  is  followed  by  a  description  of  the  methods  developed  under 
the  project  and  then  flight  testing  results. 
To summarize what was accomplished in terms of major flight testing events, with links
to  videos: 

1. Vision-only formation flight and pursuit

1.1. Helicopter maintaining formation with an airplane, vision-data only
http://uav.ae.gatech.edu/videos/ef060615d1_firstVisionBasedFormation.mpg

http://uav.ae.gatech.edu/videos/ef060615d1ob_firstVisionBasedFormation.mpg

1.2. Airplane maintaining formation with an airplane, vision-data only
http://uav.ae.gatech.edu/videos/ev070730b2_visionBasedFormation.wmv

http://uav.ae.gatech.edu/videos/ev070730b2ob_visionBasedFormation.wmv

2.  Vision-based  ground  vehicle  tracking/following 
2.1.  Following  a  truck  based  on  camera  image  only 

http://uav.ae.gatech.edu/videos/f061113b1_goodCarTrack.wmv

http://uav.ae.gatech.edu/videos/f061113b1ob_goodCarTrack.wmv


3.  Vision-based  obstacle  avoidance 

3.1. Vision-data only to avoid a fixed obstacle

http://uav.ae.gatech.edu/videos/f071030a2_autoAvoid1.wmv

http://uav.ae.gatech.edu/videos/f071030a2ob_autoAvoid1.wmv

It  is  worth  noting  that  these  flight  testing  activities  utilized  the  advanced  methods 
developed  under  this  effort,  including  geometric  active  contours,  particle  filtering,  neural 
network  augmentation  of  an  extended  Kalman  filter,  and  others  as  noted  in  this  report. 

Note:  References  to  other  work  are  given  numerically,  and  listed  in  the  references 
section.  References  listed  by  Author  and  year  are  publications  derived  from  this  effort, 
listed  in  the  Publications  list.  The  publications  list  also  includes  publications  not  directly 
referred  to  in  this  report. 
Background

Vision  and  Control 

Prior  to  the  effort,  we  had  been  working  on  problems  in  image  processing  and  computer 
vision  using  geometry-based  differential  equations,  invariant  theory,  and  statistical 
methods  for  various  purposes  including  segmentation,  edge  detection,  image 
enhancement,  de-noising,  registration,  surface  warping,  morphology,  stereo,  optical  flow, 
shape representation and object recognition [1].

Boundary  Based  Tracking  Using  PDEs  and  Active  Contours 

Tracking  by  active  contours  (also  known  as  snakes)  was  an  established  method  in 
controlled  active  vision.  In  2-D  images,  virtual  forces  derived  from  the  images  drive  a 
parametrically  defined  line  with  constraints  on  how  it  can  deform.  Such  a  virtual  force 
can,  for  example,  be  derived  from  the  local  edge  strength.  The  parametric  line,  or  the  so- 
called  snake,  is  then  attracted  to  the  edges  of  the  images,  forming  an  outline  of  the  object 
at hand. The modern approach to active contours is based on a more rigorous
mathematical  framework.  Snake-based  tracking  using  mean  curvature  evolution  schemes 
can  be  a  powerful  tool  in  real-time  tracking,  segmentation  and  target  recognition  [2]. 

Active  contours,  or  snakes,  are  autonomous  processes  that  employ  image  coherence  in 
order to track features of interest over time. In the past few years, a number of
approaches have been proposed. The underlying principle in these approaches is
the utilization of deformable contours that conform to various object shapes and
motions.  Snakes  have  been  used  for  edge  and  curve  detection,  segmentation,  shape 
modeling,  and  especially  for  visual  tracking.  We  have  developed  and  extended  a 
deformable  contour  model  that  is  derived  from  a  generalization  of  the  curve  shortening 
evolution.  It  is  based  on  the  geometric  intuition  of  multiplying  the  Euclidean  arc-length 
by  a  function  tailored  to  the  features  of  interest  to  which  we  want  to  flow,  and  then 
computing  the  resulting  gradient  flow  equations.  This  leads  to  interesting  new  models 
that  efficiently  attract  the  given  active  contour  to  the  desired  feature.  The  methods 
generalize  naturally  to  3-D  or  4-D.  The  resulting  active  contour  models  have  the  ability 
to  change  topology  (automatic  merging  and  breaking  of  contours),  essential  to  tracking 
multiple  objects  and  tracking  in  clutter.  Figure  1  illustrates  the  process  of  “bubbles” 
(expanding  deformable  contours)  finding  a  truck  in  an  image. 
Figure 1: Active contour bubbles capturing a truck in an image; note the automatic
merging  and  topological  changes  involved  in  finding  the  image. 
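
The gradient flow described above can be written out as follows. This is a sketch in the standard geodesic active contour notation, which is an assumption here and may differ from the notation of the cited papers: φ denotes the weighting function tailored to the features of interest, C the contour, s its Euclidean arc length, κ its curvature, N its unit inward normal, and Ψ a level-set function whose zero level set is C.

```latex
% Weighted length: Euclidean arc length ds multiplied by the weight \phi.
\[
  L_\phi(C) = \int_0^{L(C)} \phi\big(C(s)\big)\, ds
\]
% Gradient descent on L_\phi gives the contour evolution.
\[
  \frac{\partial C}{\partial t}
    = \big(\phi\,\kappa - \nabla\phi \cdot \mathbf{N}\big)\,\mathbf{N}
\]
% Level-set form; this implicit representation is what permits
% automatic merging and breaking of contours.
\[
  \frac{\partial \Psi}{\partial t}
    = \phi\,\|\nabla\Psi\|\,
      \operatorname{div}\!\left(\frac{\nabla\Psi}{\|\nabla\Psi\|}\right)
    + \nabla\phi \cdot \nabla\Psi
\]
```

A common choice, assumed here only for illustration, is φ = 1/(1 + |∇(G_σ * I)|²), which is small near strong edges of the image I, so the contour slows and settles there. Sign conventions depend on the orientation chosen for N and Ψ.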

Knowledge-Based  Segmentation 

In  knowledge-based  segmentation,  these  tracking  approaches  are  augmented  with 
knowledge  of  object  shape  to  guide  segmentation  in  uncertain  regions.  A  natural  way  of 
doing  this  which  combines  the  statistical  and  curvature  driven  approaches  is  to  smooth 
the posterior probabilities and then extract a maximum a posteriori (MAP) classification in
segmenting the given image. More precisely, in the Bayesian framework, we can
calculate the posteriors $P'_i = \Pr(C_i = c \mid V_i = v)$ (the C’s are the possible classes and the V’s
the intensities) and smooth by evolving $P'$ according to the affine geometric heat flow
equation, under which the level sets of $P'$ undergo affine curve shortening whilst
preserving edges [3,4]. Shape information may be introduced into the image
segmentation process using this geometric variational framework. The idea is to
introduce a representation for shapes and define a probability distribution over the
variances of a set of “training images”. Then, in order to segment a structure from an
image, one can evolve a geometric active contour both locally, using image information,
and globally toward a maximum a posteriori estimate of shape.
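
The smooth-then-classify idea can be illustrated with a minimal Python sketch. It makes several assumptions that are not in the report: Gaussian intensity models per class, a simple 5-point averaging filter as a stand-in for the affine geometric heat flow (which, unlike this stand-in, preserves edges), and the hypothetical function name `map_segment`.

```python
import numpy as np

def map_segment(image, means, sigmas, priors, n_smooth=3):
    """MAP labelling from smoothed per-class posteriors (illustrative only)."""
    image = np.asarray(image, dtype=float)
    means = np.asarray(means, dtype=float)[:, None, None]
    sigmas = np.asarray(sigmas, dtype=float)[:, None, None]
    priors = np.asarray(priors, dtype=float)[:, None, None]

    # Assumed Gaussian likelihoods Pr(V = v | C = c), one map per class.
    lik = np.exp(-0.5 * ((image[None] - means) / sigmas) ** 2) / sigmas
    post = priors * lik
    post /= post.sum(axis=0, keepdims=True)   # Bayes rule: Pr(C = c | V = v)

    # Stand-in smoothing of each posterior map (5-point average).
    # The report uses the affine geometric heat flow here instead.
    for _ in range(n_smooth):
        p = np.pad(post, ((0, 0), (1, 1), (1, 1)), mode="edge")
        post = (p[:, 1:-1, 1:-1] + p[:, :-2, 1:-1] + p[:, 2:, 1:-1]
                + p[:, 1:-1, :-2] + p[:, 1:-1, 2:]) / 5.0
        post /= post.sum(axis=0, keepdims=True)

    # MAP classification: pick the class with the largest smoothed posterior.
    return post.argmax(axis=0)
```

Smoothing the posteriors rather than the image means an isolated bright pixel inside a dark region is outvoted by its neighbours before the hard MAP decision is made.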

Adaptive  Learning,  Noise  Models,  and  Geometric  Active  Contours 

A  key  element  in  our  approach  is  to  advance  our  algorithmic  research  on  PDEs  and  active 
contours.  At  the  same  time,  we  want  to  supplement  the  statistical  methods  discussed 
above  with  techniques  that  leverage  anatomical  knowledge,  primarily  the  PDE  and  level 
set  methods.  So  far,  we  have  only  considered  simple  prior  distributions  and  adaptation 
techniques.  At  present,  the  weighting  factor  is  derived  locally,  based  on  edge 
computations.  A  more  flexible  conformal  metric  will  be  obtained  if  the  metric  is  learned 
from  the  data  and  if  the  model  incorporates  non-local  information.  For  this  purpose,  we 
will  explore  the  use  of  adaptive  filtering.  We  also  plan  to  incorporate  Bayesian  statistics 
into  the  stopping  (conformal  weighting)  rule  in  the  geometric  active  contour  model.  The 
classical snake cost-function is based on the minimization of length in the plane or of
surface  area  in  space.  We  will  implement  alternative  cost  terms  that  will  be  based  on  the 
minimization  of  other  natural  geometric  quantities  (such  as  area  in  the  plane  or  volume  in 
space)  [5].  We  also  consider  methods  for  explicitly  coupling  boundary  and  region  data 
within  the  geometric  active  contour  framework  [6-9].  We  plan  to  extend  the  techniques 
based  on  the  concept  of  minimum  description  length  (MDL).  The  idea  is  to  consider  the 
segmentation  problem  as  a  partitioning  problem,  where  the  criterion  for  choosing  one 
partition  instead  of  another  is  the  description  length.  The  measure  of  the  description 
length  must  be  accomplished  according  to  some  a  priori  language.  Thus,  it  is  essentially 
equivalent to the maximum a posteriori (MAP) estimate from the Bayesian paradigm. It
may be regarded as an information-theoretic interpretation of this classical method.
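
The equivalence between MDL and MAP can be made explicit with a short derivation. This is a sketch assuming an ideal (Shannon) code, in which an event of probability p costs -log2 p bits; the symbols S (a candidate partition) and I (the image) are introduced here for illustration only.

```latex
% Two-part description length: cost of the partition plus cost of the
% image given the partition.
\[
  L(S, I) = L(S) + L(I \mid S) = -\log_2 P(S) - \log_2 P(I \mid S)
\]
% Minimizing the description length is maximizing the posterior.
\[
  \arg\min_S L(S, I)
    = \arg\max_S P(S)\,P(I \mid S)
    = \arg\max_S P(S \mid I)
\]
```

The last equality holds because P(I) does not depend on S. The choice of "a priori language" in the text corresponds to the choice of the prior P(S).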

Robust  Tracking  of  Deforming  Targets 

Earlier  [10],  we  proposed  a  framework  that  allows  us  to  separate  the  overall  motion  from 
the  more  general  deformation.  We  have  also  extended  this  framework  to  handle 
occlusions,  as  a  particular  type  of  deformation  [11].  The  key  idea  underlying  our 
framework  is  that  the  notion  of  motion  throughout  a  deformation  is  very  tightly  coupled 
with  the  notion  of  shape  average.  In  particular,  if  a  deforming  object  is  recognized  as 
moving,  there  must  be  an  underlying  object  (which  will  turn  out  to  be  the  shape  average) 
moving  with  the  same  motion,  from  which  the  original  object  can  be  obtained  with 
minimum  deformations.  Therefore,  we  will  model  a  general  deformation  as  the 
composition  of  a  group  action  g  on  a  particular  object,  on  top  of  which  a  local 
deformation  is  applied.  The  shape  average  is  defined  as  the  one  that  minimizes  such 
deformations.  The  goal  is,  given  a  collection  of  images  that  contain  a  given  target,  to 
estimate  both  its  motion  (a  finite-dimensional  group)  and  its  average  shape.  Some  of  our 
prior  work  was  used  as  a  starting  point  for  studying  these  issues;  see  in  particular  [10-12]. 

Adaptive  Estimation  and  Control 

Nonlinear  Estimation  and  Adaptive  Control:  Existing  methods  for  nonlinear  state 
estimation  impose  assumptions  that  severely  limit  their  domain  of  applicability,  such  as 
to  systems  that  are  linear  with  respect  to  unknown  parameters,  or  systems  that  can  be 
transformed  to  output  feedback  form.  Neural  network  (NN)  based  adaptive  observers 
have  relaxed  some  of  these  assumptions;  however  robustness  to  unmodeled  dynamics  and 
disturbances  has  not  been  addressed.  We  have  recently  developed  a  methodology  for 
adaptive  state  estimation  of  bounded  nonlinear  processes.  The  approach  augments  an 
existing  linear  observer  with  two  NNs  that  model  the  uncertainties  from  a  finite  history  of 
available  measurements  [13].  This  approach  is  adaptive  to  both  unmodeled 
nonlinearities  and  unmodeled  dynamics,  precisely  the  situation  commonly  encountered  in 
image  processing  applications. 

Adaptive  Guidance  and  Flight  Control:  Here,  we  explore  direct  utilization  of  vision  data 
in  guidance  and  flight  control.  We  are  approaching  this  topic  from  the  perspective  of 
using  only  vision  data  to  analyze  the  environment  in  which  the  vehicle  must  be  flown, 
and  to  pursue  targets  within  this  environment.  The  use  of  NN  based  adaptive  control  for 
flight  control  has  been  extensively  developed  and  applied  by  our  group  [14-16].  We  have 
also  initiated  several  collaborative  efforts  at  Eglin  AFB  in  the  area  of  cooperative  flight 
control and adaptive missile autopilot design. The research aspects particular to guidance
and  flight  control  that  are  new  to  this  effort  will  involve  those  aspects  associated  with 
pursuing  a  target  in  a  highly  congested/adversarial  environment.  We  envision  a  space  in 
which  a  vehicle  must  be  flown  so  as  to  detect  and  pursue  a  target  while  avoiding  both 
fixed  obstacles  and  moving  threats.  Adaptation  is  required  in  order  to  capture  the 
unknown  and  unmodeled  dynamics  associated  with  moving  targets  and  threats  in  the 
presence  of  wind/gust  disturbances  [17]. 

Sensor  Design 

We  believe  that  real-time  control  of  autonomous  airborne  vehicles  can  be  enhanced  by 
image  sensor  information  that  is  at  the  same  time  rich  and  selective.  “Richness” 
quantifies  the  amount  of  information  (in  the  Shannon  sense)  transferred  through  the 
sensor.  “Selectivity”  refers  to  the  ability  to  discriminate  the  information  that  is  most 
relevant  to  the  mission  (e.g.,  the  location  and  distribution  of  targets  on  the  ground)  from 
irrelevant  information  (e.g.,  the  grass  on  the  ground).  To  meet  this  challenge,  we  will 
develop  novel  types  of  optical  imaging  sensors  uniquely  meeting  two  objectives:  (i)  the 
sensors  will  be  optimized  in  terms  of  information  quantity  and  quality  (ii)  the  sensor 
outputs  will  be  optimized  to  serve  as  input  to  the  active  vision  control  algorithms. 
Accomplishments

Dynamic  Active  Contours 

Active  contours  (also  known  as  snakes)  are  autonomous  processes  employing  image 
coherence  in  order  to  track  features  of  interest  over  time.  They  are  capable  of 
conforming  to  objects  in  the  image  plane,  making  them  ideal  for  segmentation,  edge 
detection,  shape  modeling,  and  visual  tracking.  To  overcome  the  local  nature  of  active 
contours,  statistical  and  adaptive  based  pre-processing  can  be  integrated  into  the  stopping 
criterion,  the  inflation  mapping,  and/or  the  gains  to  more  effectively  drive  the  contour  to 
the  desired  minima.  The  ability  of  the  snakes  to  change  topology  and  quickly  capture 
desired  features  makes  them  an  indispensable  tool  for  our  visual  tracking  algorithms.  In 
knowledge-based  segmentation,  the  ability  to  track  targets  is  enhanced  through 
knowledge  of  image  content  for  simplification  and  noise  removal.  Object  shape  can  also 
be  incorporated  to  improve  targeting  of  desired  objects  and  reduce  false-positives.  A 
natural  way  to  incorporate  knowledge-based  techniques  in  an  adaptive  framework  is  to 
use  maximum  a  posteriori  (MAP)  classification  for  segmentation  of  an  image.  An 
example  of  how  the  MAP  algorithm  can  reduce  irrelevant  image  content  for  improved 
segmentation and, in the process, provide an unambiguous minimum to the active contour
is  shown  in  Figure  2. 


Figure  2:  Sample  image  processing  with  background  clutter 


The  generalized  tracking  problem  necessarily  involves  acquiring  visual  feedback  from  a 
dynamically  changing  external  world.  Although  the  algorithms  discussed  above  perform 
well, they were initially developed to solve static problems. We seek to implement the
dynamic  version  of  geometric  active  contours  for  improved  robustness  to  background 
noise  and  obstacles  within  the  tracking  context.  Also,  the  MAP  classification  technique 
for  knowledge-based  segmentation  of  imagery  relies  on  certain  fixed  assumptions,  such 
as  a  static  number  of  classes  to  segment  the  image  into.  However,  as  the  nature  of  the 
terrain  and  the  sky  vary,  so  can  the  number  of  classes.  We  have  also  investigated 
methods  to  dynamically  adjust  the  number  of  classes.  Doing  so  reduces  the  probability 
of  losing  a  tracked  target  in  the  segmentation  process. 

Particle  Filtering  for  Geometric  Active  Contours 

Although the algorithms discussed above perform well, they were initially developed to
solve static problems. We have implemented the dynamic versions of geometric active
contours  for  improved  robustness  to  background  noise  and  obstacles  within  the  tracking 
context.  Tracking  algorithms  using  Kalman  filters  or  particle  filters  have  been  proposed 
for finite dimensional representations of shape, but these are dependent on the chosen
parameterization and cannot handle changes in curve topology. Geometric active
contours provide a framework that is parameterization independent and allows for
changes  in  topology.  We  formulated  a  particle  filtering  algorithm  in  the  geometric  active 
contour  framework  that  can  be  used  for  tracking  moving  and  deforming  objects  [Ha, 
Johnson,  Tannenbaum  2008].  To  the  best  of  our  knowledge,  this  is  the  first  attempt  to 
implement  an  approximate  particle  filtering  algorithm  for  tracking  on  a  (theoretically) 
infinite  dimensional  state  space. 
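
For orientation, the predict/update/resample cycle of a generic bootstrap particle filter can be sketched as below. This is not the algorithm of [Ha, Johnson, Tannenbaum 2008]: the state here is only a 2-D target position, whereas the report's state includes the contour itself, and the Gaussian measurement likelihood is an assumed stand-in for an image-based likelihood. The function name and the parameters `motion_std` and `meas_std` are hypothetical.

```python
import numpy as np

def particle_filter_step(particles, weights, measurement, rng,
                         motion_std=5.0, meas_std=3.0):
    """One cycle of a bootstrap particle filter on 2-D positions."""
    n = len(particles)
    # Predict: known state transition (identity here) plus Gaussian noise.
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)
    # Update: reweight each particle by an assumed Gaussian likelihood.
    d2 = np.sum((particles - measurement) ** 2, axis=1)
    weights = weights * np.exp(-0.5 * d2 / meas_std ** 2)
    weights = weights + 1e-300            # guard against all-zero weights
    weights = weights / weights.sum()
    # Point estimate: weighted mean of the particle positions.
    estimate = weights @ particles
    # Resample (multinomial) to concentrate particles on likely states.
    idx = rng.choice(n, size=n, p=weights)
    return particles[idx], np.full(n, 1.0 / n), estimate
```

A caller would initialise `particles` as an (n, 2) array around the first detection, set `weights` uniform, and invoke the step once per frame with a generator from `np.random.default_rng`.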

In  Figure  2,  we  track  a  van  moving  amid  clutter  in  the  background.  There  is  sudden  and 
large  motion  of  the  van  (in  some  cases,  the  van  moves  more  than  20  pixels  between 
consecutive  frames)  due  to  jitter  in  the  camera  motion.  Furthermore,  it  gets  largely 
occluded  (only  a  small  fraction  of  the  van  is  visible)  many  times  by  buildings  or  trees. 
Tracking  such  a  sequence  using  active  contours  alone  is  bound  to  fail  since  the  van  may 
lie  outside  the  basin  of  attraction  of  the  starting  contour.  As  shown  in  Figure  2,  the 
proposed  method  tracks  the  van  successfully  despite  large  motion  and  occlusion.  For  this 
test sequence, no motion model was adapted; i.e., the state transition was assumed known,
with Gaussian noise. The figure shows tracking results with 50 particles.


Figure 2: Tracking van sequence through occlusions by adding a particle filter; note van
going partially behind tree. Segmentation shown with red curve.


The video sequence sampled in Figure 3 has very low contrast, and in general it is very
difficult to locate the boundary of the airborne target. The motion of the airplane from
one frame to the next is also quite large; hence, traditional active contour based methods
fail to track the plane. In this experiment, only translational motion was assumed for the
moving airplane. Figure 3 shows a few frames of the tracking results. Even though no
scale parameter was included in the motion model, the contour deformation part of the
algorithm adjusts for the change in size of the plane (see the first and last frames). Other
types  of  affine  changes  in  the  shape  are  also  taken  care  of  within  the  proposed  framework 
without  having  to  explicitly  model  them.  Tracking  results  were  obtained  with  just  30 
particles. 


Figure 3: Tracking low-contrast rapid airplane motion. Segmentation shown with black
curve.


We  described  a  fast  implementation  of  the  algorithm  which  greatly  improves  the 
computational  time  of  the  segmentation  process  [Ha,  Johnson,  Tannenbaum  2008].  We 
have  tested  particle  filtering  using  this  fast  active  contour  model,  and  the  filtering 
algorithm has shown the ability to robustly track an aerial target under varied conditions
(Figures 4, 5, and 6). The computational speed of the algorithm has allowed us to employ
it  for  formation  flight  among  several  unmanned  aircraft,  described  further  below  in  the 
flight  test  results  section.  We  have  also  demonstrated  the  utility  of  the  filtering  algorithm 
for  multiple  target  tracking  in  the  presence  of  occlusions. 


Figure 4: Tracking aerial targets against clutter using particle filtering; horizontal and
vertical distributions of target location probability density shown on edges.


This  year  we  also  proposed  a  fast  implementation  of  the  Chan-Vese  active  contour  model 
that  improves  the  computational  speed  and  the  robustness  of  the  image  processing.  The 
computational speed of tracking using the fast implementation reaches 100
frames/second in typical tracking scenes from several flight tests (Figure 7).


Figure 7: Detecting and tracking with the new method. The two images on the left show
detecting several windows. The two images on the right show tracking an aerial target in
cloudy sky (computational speed: 100 frames/second).
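
The report does not give the details of its fast implementation, so the following Python sketch shows only the region-competition core of the standard two-phase, piecewise-constant Chan-Vese model, with the contour-length penalty dropped for brevity. The function name `chan_vese_no_length` is hypothetical, and this is not claimed to be the method used in the flight tests.

```python
import numpy as np

def chan_vese_no_length(image, init_mask, n_iter=20):
    """Two-phase piecewise-constant segmentation without the length term."""
    image = np.asarray(image, dtype=float)
    inside = np.asarray(init_mask, dtype=bool)
    for _ in range(n_iter):
        if inside.all() or not inside.any():
            break                          # one region vanished; stop
        # Best constant approximation of the image in each region.
        c_in = image[inside].mean()
        c_out = image[~inside].mean()
        # Assign each pixel to the region whose mean fits it better.
        new_inside = (image - c_in) ** 2 < (image - c_out) ** 2
        if np.array_equal(new_inside, inside):
            break                          # converged
        inside = new_inside
    return inside
```

Because every pixel is updated directly from the two region means, with no PDE time stepping, a scheme of this kind converges in a few sweeps, which is one plausible route to high frame rates. The full model adds a curvature term that regularizes the boundary.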
Robust Tracking of Deforming Targets

The  investigators  Yezzi  and  Soatto  have  incorporated  feedback  ideas  from  control  and 
estimation  theory  into  their  prior  framework  of  "deformotion"  (a  method  to 
simultaneously  track  the  motion  and  the  deformation  of  the  appearance  of  a  moving  and 
deforming  object  while  distinguishing  clearly  between  the  two  components  of  the 
changing  shape).  Prior  to  the  beginning  of  this  effort,  the  framework  was  used  for  the 
purpose  of  tracking  only  through  the  simple  notion  of  a  "moving  average"  via  the 
simultaneous  segmentation  and  registration  of  several  consecutive  frames  of  a  video 
sequence  rather  than  the  traditional  frame-by-frame  approach  typically  used  in  active 
contour  methodologies.  The  recent  incorporation  of  a  dynamical  model  for  the 
deformation  and  the  motion  components  of  this  framework  by  the  investigators,  thereby 
allowing the use of causal observers, has led to tremendous increases in the robustness
of  the  tracker  to  even  severe  temporary  occlusions  of  the  object  being  tracked. 


In  Figure  8,  we  observe  the  contour  tracking  a  person  in  a  parking  lot  where  there  is  a 
black  vertical  band  of  missing  pixel  data  due  to  a  malfunction  of  the  digital  camera.  As 
the person passes through this vertical band, almost fully occluded at one point, the
dynamical model helps propagate the contour through the occlusion without wildly
spilling, distorting, or losing its lock by the time the person reappears on the other side
of the bar. Also in Figure 8 we see even more severe occlusion taking place as a student
walks behind a very large printer, becoming fully occluded for a number of frames.


Figure 8: Tracking through occlusions by adding a dynamical model to the "deformotion"
framework: (top) behind missing pixel data; (bottom) behind office equipment.


Optical  Flow 

In  the  second  year  of  the  program,  we  formulated  a  straightforward  approach  for 
predicting  and  estimating  large-amplitude  optical  flows.  The  optical  flow  model 
underpinning  the  proposed  algorithm  incorporates  temporal  coherence,  which  is  captured 
by  an  evolution  equation  to  provide  the  optimal  fusing  of  data  from  multiple  frames  of 
measurements.  It  allows  the  formulation  of  the  estimation  problem  as  a  state  estimation 
problem, which can be efficiently solved by Kalman filtering. Though such dynamic
approaches  for  estimating  optical  flows  have  long  been  in  use,  our  proposed  approach  is 
innovative  in  that  it  shows  how  to  adapt  both  state  and  measurement  models  of  the 
Kalman  system  in  order  to  estimate  large-amplitude  optical  flows,  for  which  the 
linearized  modeling  (frequently  referred  to  as  the  differential  optical  flow  equation)  is 
known  to  fail.  It  is  done  by  modeling  the  optical  flow  as  a  composition  of  its  predecessor 
(i.e.,  a  time-delayed  version)  and  a  “complementary”  optical  flow.  Consequently,  the 
former  is  used  to  predict  the  current  optical  flow  and  to  pre-warp  the  images  to  be 
“connected”  using  this  prediction.  After  the  pre-warping  is  completed,  the  resultant 
images  are  employed  to  form  a  measurement  model  for  the  “complementary”  optical 
flow and then to update the Kalman estimate.
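The predict, pre-warp, and update cycle described above can be illustrated with a deliberately reduced sketch. It is not the investigators' implementation: it assumes a single global displacement of a 1-D periodic signal, an integer-pixel pre-warp, and a scalar Kalman filter in which the flow is modeled as persisting from frame to frame. The function names, the noise variances `q` and `r`, and the test pattern are all hypothetical.

```python
import math

N = 64                      # signal length; one full period of the test pattern
K_WAVE = 2.0 * math.pi / N  # spatial frequency of the test pattern

def frame(shift):
    """1-D periodic 'image' whose content is displaced by `shift` pixels."""
    return [math.sin(K_WAVE * (x - shift)) for x in range(N)]

def linearized_flow(i0, i1):
    """Least-squares solution of the differential flow equation Ix*u + It = 0."""
    num = den = 0.0
    for x in range(N):
        ix = 0.5 * (i0[(x + 1) % N] - i0[(x - 1) % N])  # central difference
        it = i1[x] - i0[x]                               # temporal difference
        num += ix * it
        den += ix * ix
    return -num / den

def prewarp(i1, predicted):
    """Undo an integer-pixel predicted displacement (periodic boundary)."""
    p = int(round(predicted))
    return [i1[(x + p) % N] for x in range(N)]

def kalman_flow_step(i0, i1, u_prev, p_prev, q=1.0, r=0.25):
    """One predict / pre-warp / update cycle for a scalar flow state."""
    u_pred, p_pred = u_prev, p_prev + q                  # predict: flow persists
    p_int = int(round(u_pred))
    residual = linearized_flow(i0, prewarp(i1, p_int))   # "complementary" flow
    z = p_int + residual                                 # measurement of total flow
    gain = p_pred / (p_pred + r)
    return u_pred + gain * (z - u_pred), (1.0 - gain) * p_pred

true_flow = 20.0
i0, i1 = frame(0.0), frame(true_flow)
direct = linearized_flow(i0, i1)                         # no pre-warping
fused, var = kalman_flow_step(i0, i1, u_prev=19.0, p_prev=3.0)
print(f"direct: {direct:.2f}, with pre-warping: {fused:.2f}")
```

In this toy setting the direct linearized estimate falls well short of the true 20-pixel displacement, while the pre-warped estimate only has to recover a one-pixel residual, which is the regime where the differential flow equation is valid.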


Variational  Methods  for  Shape  from  Defocus: 

Our  group  has  successfully  tackled  the  problem  of  calibration  in  visual  accommodation. 
Visual  accommodation  is  the  process  of  extracting  three-dimensional  information  from 
images  obtained  by  averaging  different  exposures  obtained,  for  instance,  under  a 
changing  focal  length  (shape  from  defocus)  or  a  moving  scene  (shape  from  motion  blur). 
While  these  problems  are  classical  in  computer  vision  and  image  analysis,  all  algorithms 
published  so  far  required  knowledge  of  the  calibration  parameters  (aperture  of  the  lens, 
focal  lengths,  exposure  time  etc.)  in  order  to  return  a  correct  estimate.  In  practice,  this  is 
severely  limiting  since  it  requires  pre-calibration  of  the  imaging  device  following  a 
complex  protocol.  In  [Lu  et  al.,  2007],  we  have  characterized  the  set  of  all  possible 
surfaces  indistinguishable  from  deblurred  images:  They  are  simply  parameterized  by  an 
affine  transformation  of  the  inverse  depth,  where  the  affine  parameters  are  related  to  the 
calibration of the focal planes by simple algebraic relations. We have shown that the
presence of at least one plane in the scene allows disambiguating the reconstruction, since
planes are all and only the surfaces that are invariant under affine transformations of the
inverse depth. Finally, we have shown that even in cases where the correct
reconstruction  cannot  be  performed,  one  can  still  recover  a  deblurred  version  of  the 
original  data. 

Analysis  of  the  Ambiguities  in  Motion  Analysis: 

In  [Vedaldi  et  al.  2007]  we  have  proven  a  series  of  theorems  that  relate  to  the  problem  of 
reconstructing  3-D  camera  motion  (ego-motion)  from  collections  of  images  or  optical 
flow.  It  is  well  known  that  ego-motion  estimation  can  be  posed  as  an  optimization 
problem,  one  that  is  non-linear  and  non-convex,  and  that  is  subject  to  the  presence  of 
many local minima. It is also known that the shape of the L2 residual surface is littered
with singularities that pull the cost function and cause a large number of local minima in
the forward direction, that is, when the translation vector is aligned with the optical axis.
This  is  arguably  the  most  important  case  for  AFOSR  applications,  and  for  the  use  of 
vision  as  a  sensor  for  navigation  in  general.  In  this  most  recent  work,  we  have  proven  a 
theorem  that  shows  that  if  the  inverse  depth  is  bounded  away  from  zero  during  the 
optimization  of  the  L2  residual,  singularities  in  the  forward  direction  disappear,  and  the 
L2  residual  is  actually  a  smooth  function.  This  effectively  replaces  a  non-smooth 
unconstrained optimization with a smooth constrained one, and improves the results of
motion  estimation  algorithms. 


Model  Based  Radiance  Estimates  for  Segmentation 

Sometimes  contrast  between  average  foreground  and  average  background  intensities 
makes  the  segmentation  task  one  of  the  easier  aspects  of  the  overall  tracking  algorithm. 
However,  very  few  objects  can  be  specified  with  nearly  constant  radiances  that  differ 
sharply  in  a  scene  with  nearly  constant  background  radiance.  While  more  sophisticated 
radiance  models  have  existed  for  a  while  (the  classic  one  being  the  piecewise  smooth 
model  used  in  the  Mumford-Shah  functional),  they  have  been  computationally  very 
expensive,  making  their  utility  limited.  We  explored  methods  to  substantially  reduce  the 
computational  complexity  of  more  flexible  radiance  models,  with  two  very  promising 
leads  that  may  make  their  applicability  to  visual  tracking  much  more  plausible.  The  first 
lead  deals  with  a  dimensionality  reduction  technique,  based  on  training  data,  applied  to 
both  shape  and  radiance  measurements  accrued  from  prior  images  of  related  targets  to  be 
tracked.  The  second  lead  deals  with  an  approximation  of  the  class  of  piecewise  smooth 
functions  by  basis  functions  generated  by  convolution  of  the  input  image  with  families  of 
low-pass  filters.  We  can  see  the  power  of  these  more  flexible  radiance  models  for 
tracking  in  Fig.  9  where  we  are  trying  to  track  a  person's  head.  Notice  that  the  face 
would  be  very  poorly  approximated  by  a  simple  constant  or  nearly  constant  radiance. 


Figure 9: Adding piecewise smooth radiance to the deformotion method to track objects
with non-trivial albedos. (Top Row) Tracking results. (Bottom Row) Piecewise smooth
radiance models used for the above tracking results (no edge detectors used).


Active-Vision Control Systems MURI Final Report




Shape-Driven  Observer  Theory  for  Tracking: 

We  have  proposed  a  deterministic  observer  framework  for  visual  tracking  based  on  non- 
parametric  implicit  (level-set)  curve  descriptions  [Niethammer,  Vela,  Tannenbaum, 
2008]. The observer is continuous-discrete, with continuous-time system dynamics and
discrete-time  measurements.  Its  state-space  consists  of  an  estimated  curve  position 
augmented  by  additional  states  (e.g.,  velocities)  associated  with  every  point  on  the 
estimated  curve.  Multiple  simulation  models  are  proposed  for  state  prediction. 
Measurements  are  performed  through  standard  static  segmentation  algorithms  and 
optical-flow  computations.  Special  emphasis  is  given  to  the  geometric  formulation  of  the 
overall dynamical system. The discrete-time measurements lead to the problem of
geometric  curve  interpolation  and  the  discrete-time  filtering  of  quantities  propagated 
along  with  the  estimated  curve.  Interpolation  and  filtering  are  intimately  linked  to  the 
correspondence  problem  between  curves.  Correspondences  are  established  by  a  Laplace- 
equation  approach.  The  proposed  scheme  is  implemented  completely  implicitly  (by 
Eulerian  numerical  solutions  of  transport  equations)  and  thus  naturally  allows  for 
topological  changes  and  subpixel  accuracy  on  the  computational  grid. 

Local  Region-Based  Segmentations: 

We  developed  a  natural  framework  that  allows  any  region-based  segmentation  energy  to 
be  re-formulated  in  a  local  way  [Lankton,  Tannenbaum  2008].  By  considering  local 
rather  than  global  image  statistics  and  evolving  a  contour  based  on  local  information, 
localized  contours  are  capable  of  segmenting  objects  with  heterogeneous  feature  profiles 
that  would  otherwise  be  difficult  to  capture  correctly  just  using  a  standard  global  method. 
The  technique  is  versatile  enough  to  be  used  with  any  global  region-based  active  contour 
energy  and  instill  in  it  the  benefits  of  localization.  We  have  demonstrated  the  localization 
of  three  well-known  energies  in  order  to  illustrate  how  our  framework  can  be  applied  to 
any energy. The results we have obtained on challenging images illustrate the robust
and accurate segmentations that are possible with this new class of active contour models.

Shape,  Scale  and  Registration 

An  important  issue  that  needs  to  be  addressed  in  image-based  models  is  that  of  scale. 
Because  targets  can  appear  at  any  scale,  depending  on  their  position  relative  to  the  sensor, 
and  yet  resolution  imposes  a  lower  bound  on  detectable  structures,  algorithms  have  to 
operate at multiple scales of resolution. On the other hand, some objects are detectable
only at a certain scale, which defines the statistics that make them detectable (see images
of a cheetah below). We have developed novel techniques to classify image regions based on
automatic scale detection. We believe that this is a crucial step in texture analysis and
will play an important role in the future, as complex camouflaged targets
become manageable in real time.

Another  important  problem  arises  from  the  fact  that  often  the  object  of  interest  (template) 
appears  rather  different  from  the  actual  target  when  embedded  in  real  scenes.  Therefore, 
standard  cost  criteria  traditionally  used  in  deformable  templates  often  yield  catastrophic 
failure  in  tracking  and  registration  algorithms.  Recently,  there  has  been  a  resurgence  of 
information-theoretic  criteria  for  tracking,  segmentation  and  registration,  driven  in 
particular by medical imaging applications where the benefit of multi-modal registration
is  immediately  obvious.  We  have  developed  techniques  to  perform  multi-modal 
registration  of  images  affected  by  significant  distortions  in  the  multi-modal  data 
collection  process.  For  instance,  in  registering  anatomical  atlases  to  gene-expression  data, 
one attempts to put in correspondence different objects that, without a significant amount
of prior knowledge at hand, appear to have little in common. The schemes we develop
are  “low-level,  bottom-up”  algorithms  that  do  not  require  explicit  domain  knowledge, 
and  are  therefore  portable  to  other  domains  [Yi-Soatto,  2008]. 

Estimation  Problem  for  Moving  Airborne  Object  Tracking 

We  have  developed  airborne  target  tracking  algorithms  for  use  on  UAVs  equipped  with 
monocular imaging systems. The UAV is to track an object/target within its field
of  view,  requiring  an  estimate  of  target  relative  position.  Unfortunately,  due  to  the 
camera  projection  equations,  the  recovery  of  range  is  an  ill-posed  problem  for  monocular 
imaging  systems.  To  overcome  this,  several  approaches  have  been  investigated. 

The  standard  EKF  for  range  estimation  has  traditionally  been  performed  with  knowledge 
of  target  bearing  only,  known  as  bearings-only  range  estimation.  The  algorithm  estimates 
relative  range,  line-of-sight  angle  (LOS),  and  LOS  rate  using  the  visual  information 
obtained  from  an  on-board  camera.  Range  is  unobservable  except  during  certain 
maneuvers,  and  Leader  accelerations  can  cause  an  EKF  to  diverge.  Fortunately,  the 
image  of  the  Leader  provides  indirect  observation  of  the  range  through  measurement  of 
target size in the imaging plane. The size of the target is defined to be the longest axis of
the aircraft (typically close to the wingspan). Measuring the angle subtended by the
Leader  in  the  image  plane  renders  range  observable.  The  EKF  is  augmented  with  an 
additional  target-size  state  to  utilize  the  subtended  angle  information. 
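The geometric link between subtended angle, target size, and range can be sketched as follows. This is a minimal illustration that assumes the target's longest axis is perpendicular to the line of sight; the function and parameter names are hypothetical. When the size is unknown, as in the augmented EKF described above, it is estimated as an extra state rather than supplied as an input.

```python
import math

def subtended_angle(target_size, distance):
    """Angle (rad) subtended by a target of the given size at the given range."""
    return 2.0 * math.atan2(0.5 * target_size, distance)

def range_from_angle(target_size, angle):
    """Invert the relation above: range from the subtended angle and a known size."""
    return 0.5 * target_size / math.tan(0.5 * angle)

# Example with assumed values: a 10 m wingspan seen from 100 m.
alpha = subtended_angle(10.0, 100.0)
recovered = range_from_angle(10.0, alpha)
print(f"subtended angle: {math.degrees(alpha):.2f} deg, recovered range: {recovered:.1f} m")
```

For small angles the relation reduces to range ≈ size / angle, which shows why an error in the assumed size translates directly into a proportional range error.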

Guidance  for  Formation  Flight 

The  vision-based  estimation  filter  in  conjunction  with  active  contours  can  be  used  to 
implement  the  tracking  problem  described  above.  To  do  so,  range  and  line  of  sight 
estimates  are  compared  to  desired  values  and  converted  into  control  commands  for  the 
UAV.  The  control  commands  are  obtained  using  standard  guidance  and  pursuit  laws. 
Referring  to  Figure  10,  in  this  scenario,  each  UAV  may  follow  several  other  vehicles,  and 
may  have  more  than  one  desired  relative  range  and  angle.  However,  the  Leader-Follower 
guidance  algorithm  has  only  two  parameters:  desired  relative  range  and  desired  relative 
angle.  This  leads  one  to  average  the  desired  ranges  and  desired  relative  angles  for  each 
UAV (similar to the idea of averaging the pseudo-control in [17]).

Figure 10: 5-Ship formation 2D simulation: (Left) planar trajectories, (Right) range errors


We  have  also  explored  guidance  solutions  that  allow  groups  of  unmanned  vehicles  to 
move  in  some  coordinated  fashion  while  avoiding  fixed  obstacles.  When  the  target  and 
obstacle  size  are  assumed  known,  the  range  can  be  computed  from  the  geometric 
relationship  involving  the  subtended  angle,  object  size  and  range.  In  this  case,  we  have 
used  an  adaptive  neural  network  (NN)  to  directly  estimate  the  range-rate  of  the  target 
from  range  and  angular  measurements  of  the  LOS  vector.  The  output  of  the  NN  is  used 
in  the  guidance  policy  for  pursuing  the  target.  In  the  past  year,  we  have  also  explored  a 
more  decentralized,  leaderless  formation  scheme.  In  the  latter  scheme,  each  vehicle  in 
the  formation  implements  a  guidance  policy  that  is  a  blend  of  waypoint  tracking, 
formation  control  and  obstacle  avoidance.  The  scheme  increases  flexibility  of  the 
formations  by  allowing  transitions  in  the  formation  shape  and  reducing  dependency  on  a 
single vehicle (leader); see Figure 11.


Figure 11: Leaderless formation:
(Left) with obstacles, (Right) transition to a line formation


The  EKF  can  produce  biased  estimates  of  the  range  due  to  the  unknown  target 
acceleration.  We  have  studied  various  ways  of  improving  the  estimator.  One  way  is  to 
construct  an  optimal  guidance  policy  by  minimizing  the  variance  of  the  range  estimation 
error  in  the  EKF  design.  The  guidance  policy  results  in  maneuvers  perpendicular  to  the 
LOS vector. Another line of work has involved modeling the target acceleration as
a Gauss-Markov random process. While this helped in reducing the bias in the range
estimate, a fixed model cannot capture the variety of possible target accelerations. A third line
of  research  involves  augmenting  the  EKF  with  an  adaptive  NN  (EKF  +  NN)  that 
produces  an  estimate  of  the  unknown  target  acceleration.  The  NN  trains  on  the  residuals, 
i.e.,  the  error  between  the  image  plane  measurements  and  the  EKF  estimates  of  these 
measurements. A typical result is illustrated in Figure 12.

Figure 12: Range estimation results:
(Left) basic extended Kalman filter, (Right) adding a neural network
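For reference, the Gauss-Markov acceleration model mentioned above is commonly taken to be a first-order process with correlation time tau and stationary standard deviation sigma. The sketch below shows the standard exact discretization of such a process. The first-order form, the parameter values, and the function names are assumptions for illustration; the report does not state which order or parameters were used.

```python
import math
import random

def gauss_markov_discretize(tau, sigma, dt):
    """Return (phi, q) such that a[k+1] = phi * a[k] + w[k], with Var(w) = q."""
    phi = math.exp(-dt / tau)             # correlation retained over one step
    q = sigma ** 2 * (1.0 - phi ** 2)     # keeps the stationary variance at sigma^2
    return phi, q

def simulate(tau, sigma, dt, steps, seed=0):
    """Draw one sample path of the discretized process."""
    rng = random.Random(seed)
    phi, q = gauss_markov_discretize(tau, sigma, dt)
    a = rng.gauss(0.0, sigma)             # start in the stationary distribution
    history = [a]
    for _ in range(steps):
        a = phi * a + rng.gauss(0.0, math.sqrt(q))
        history.append(a)
    return history

phi, q = gauss_markov_discretize(tau=5.0, sigma=2.0, dt=0.1)
path = simulate(tau=5.0, sigma=2.0, dt=0.1, steps=100)
```

In a filter, `phi` and `q` would enter the state-transition and process-noise matrices for the acceleration states. Because tau and sigma are fixed in advance, the model cannot adapt to a target whose maneuver statistics change, which is the limitation noted in the text.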

We  have  previously  applied  output  feedback  control  with  a  linear  reference  model  to  the 
guidance  problem.  A  linear  reference  model  was  designed  for  the  relative  motion.  The 
linear  output  feedback  controller  was  augmented  with  an  adaptive  element  to  compensate 
for  matched  uncertainties.  A  method  has  recently  been  explored  for  tracking  control 
design  using  a  nonlinear  reference  model.  The  relative  motion  of  the  system  dynamics  in 
the  absence  of  uncertainties  defines  an  open  loop  nonlinear  reference  model.  The  loop 
around the reference model is closed via a backstepping technique, thus defining a
nonlinear  closed-loop  reference  model.  The  backstepping  controller  is  augmented  with 
an  adaptive  element  and  applied  to  the  nonlinear  dynamics  of  the  relative  motion.  The 
error dynamics are similar in structure to those of the previous work, but the unmatched
uncertainty is different and smaller in norm. The states of the relative motion including
the  relative  range  have  also  been  estimated  using  the  adaptive  observer  from  [18]. 


Guidance for Obstacle Avoidance

A  Note  on  Obstacle  Detection:  For  obstacle  avoidance,  it  is  sufficient  to  concentrate  on 
detecting  the  edges  of  obstacles.  When  we  restrict  the  obstacle  and  camera  motion  to  a 
2D  plane,  each  obstacle  appears  as  a  straight  line  in  the  camera  image  plane,  and  its  edges 
are  two  endpoints  of  the  line  in  the  image.  Those  edges  can  be  detected  as  discontinuities 
in  the  optical  flow  field.  Therefore,  it  is  assumed  that  optic  flow  is  used  to  rapidly  detect 
the  endpoints  of  all  obstacles. 

Estimating Time-to-Go (t_go) and Zero Effort Miss (ZEM) Distance: An EKF was
designed to estimate the relative position of each obstacle edge point with respect to the
UAV  from  its  image  position  measurement.  However,  in  the  case  of  moving  obstacles, 
unmodeled  dynamics  due  to  the  unknown  obstacle  motion  (target  acceleration)  may 
produce  biased  estimates  or  even  cause  the  EKF  to  diverge.  The  estimates  are  improved 
by  augmenting  the  EKF  with  an  adaptive  NN  that  compensates  for  estimation  errors  due 
to the unmodeled dynamics and other nonlinearities [Sattigeri, Calise, Evers 2003]. The
estimates of t_go and ZEM are obtained from the estimates of relative position.

Furthermore,  estimates  of  the  absolute  positions  of  all  obstacle  edge  points  can  be 
calculated  by  using  the  relative  position  estimates  and  known  camera  motion. 

Selecting the Most Critical Point: If an obstacle edge has both a t_go below its
threshold and ZEM < d_min, the UAV must maneuver to avoid the obstacle. The edge point
which satisfies the conditions above and has the smallest t_go is chosen as the most
critical point.
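The selection rule above can be sketched as follows, using the usual constant-relative-velocity definitions of time-to-go and zero effort miss. The function names, the `t_threshold` argument, and the extra check that excludes receding points (negative t_go) are assumptions made for illustration and are not taken from the report.

```python
import math

def tgo_and_zem(rel_pos, rel_vel):
    """Time to closest approach and miss distance for constant relative velocity.

    rel_pos, rel_vel: position and velocity of the edge point relative to the
    vehicle, given as 2-D tuples.
    """
    rx, ry = rel_pos
    vx, vy = rel_vel
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:                       # no relative motion: never closes
        return math.inf, math.hypot(rx, ry)
    t_go = -(rx * vx + ry * vy) / speed_sq
    zem = math.hypot(rx + vx * t_go, ry + vy * t_go)
    return t_go, zem

def most_critical_point(edges, t_threshold, d_min):
    """Index of the edge point with the smallest t_go among those violating
    both criteria, or None when no maneuver is required."""
    candidates = []
    for idx, (pos, vel) in enumerate(edges):
        t_go, zem = tgo_and_zem(pos, vel)
        if 0.0 <= t_go < t_threshold and zem < d_min:
            candidates.append((t_go, idx))
    return min(candidates)[1] if candidates else None

# Vehicle flying along +x at 10 m/s past three static edge points (assumed values).
edges = [((100.0, 5.0), (-10.0, 0.0)),
         ((50.0, 30.0), (-10.0, 0.0)),
         ((80.0, -2.0), (-10.0, 0.0))]
critical = most_critical_point(edges, t_threshold=15.0, d_min=10.0)
```

In this example the second point is the closest in time but passes 30 m abeam, so it is not a threat; of the two remaining points, the third has the smaller t_go and is selected.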

Guidance  Law  for  Obstacle  Avoidance:  A  guidance  law  for  obstacle  avoidance  was 
developed  based  on  PN  guidance.  The  vehicle  has  to  avoid  the  most  critical  point  while 
minimizing  a  deviation  from  the  planned  path.  Therefore,  a  point  of  minimum  separation 
from  the  most  critical  point  is  identified,  and  the  vehicle  is  steered  to  that  point  using  PN 
guidance.  At  the  same  time,  a  guidance  command  for  tracking  the  planned  path  is 
created  from  the  known  vehicle  motion.  These  two  commands  are  blended  with  a 
weighting  function  to  arrive  at  a  net  guidance  command.  The  weighting  function 
depends on t_go so that obstacle avoidance is given greater priority as t_go decreases.
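The blending step can be illustrated with a short sketch. The report does not give the form of the weighting function, so the exponential weight and the time constant `tau` below are purely hypothetical choices that satisfy the stated property: the avoidance command dominates as t_go decreases.

```python
import math

def avoidance_weight(t_go, tau=5.0):
    """Hypothetical weight: 1 at t_go = 0, decaying toward 0 as t_go grows."""
    return math.exp(-max(t_go, 0.0) / tau)

def blended_command(a_avoid, a_path, t_go, tau=5.0):
    """Net lateral command as a convex combination of the two guidance commands."""
    w = avoidance_weight(t_go, tau)
    return w * a_avoid + (1.0 - w) * a_path

# Assumed commands: avoidance asks for +2.0, path tracking asks for -1.0.
near = blended_command(2.0, -1.0, t_go=1.0)    # obstacle imminent
far = blended_command(2.0, -1.0, t_go=30.0)    # obstacle still distant
```

Any monotonically decreasing weight in [0, 1] would serve the same purpose; the choice affects how early and how smoothly the vehicle departs from the planned path.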

Figure  13  depicts  a  simulation  result  of  vision-based  obstacle  avoidance  using  the 
algorithms  described  above.  The  nominal  path  is  a  straight  line  along  Y=0.  The  vehicle 
has  constant  speed  and  is  controlled  by  its  turning  rate.  We  are  currently  examining  the 
effect  of  moving  obstacles,  and  evaluating  the  potential  benefit  of  augmenting  the  EKF 
with  an  adaptive  element. 


Figure  13:  Vehicle  Trajectory  for  Point  Obstacles  (left)  and  Line  Obstacles  (Right). 


A  2-D  Fixed  Object  Tracking  Method:  An  obstacle  avoidance  algorithm  that  utilizes 
information from a 2-D passive vision sensor was investigated. It is assumed that a
path-planning algorithm provides a trajectory that an aircraft has to follow. However, there
are  unforeseen  obstacles  that  must  be  negotiated  along  the  path,  requiring  a  deviation. 
An  EKF  was  designed  to  estimate  the  relative  position  of  each  obstacle  edge  point  with 
respect  to  the  aircraft  from  its  image  position  measurement.  If  an  obstacle  edge  satisfies 
both  a  time  to  closest  approach  and  zero-effort  miss  criteria,  then  the  aircraft  must 
maneuver to avoid the obstacle. The edge point that satisfies the conditions above and
has  the  smallest  time  to  closest  approach  is  chosen  as  the  most  critical  point.  The  vehicle 
has  to  avoid  the  most  critical  point  while  minimizing  a  deviation  from  the  planned  path. 
Therefore,  a  point  of  minimum  separation  from  the  most  critical  point  is  identified,  and 
the  vehicle  is  steered  to  that  point.  A  “minimum  effort”  guidance  law  has  been 
developed  in  this  last  year,  which  has  greatly  improved  vision-based  obstacle  avoidance 
metrics (Figure 14).


Figure 14: Vehicle trajectory and commanded acceleration with and without minimum effort
guidance to avoid obstacles (minimum effort: solid lines; conventional PN guidance: dashed).
Minimum effort guidance reduces the peak acceleration required and results in a smoother, safer trajectory.


Stochastically  Optimized  Guidance  Design:  It  is  well-known  that  vision-based  estimation 
performance  highly  depends  on  the  relative  motion  of  the  vehicle  to  the  target.  The 
stochastically  optimized  guidance  design  for  vision-based  control  applications  has  been 
investigated [Watanabe et al. 2006]. An extended Kalman filter (EKF) is applied to the
relative  state  navigation.  The  guidance  policy  is  derived  by  minimizing  the  expected 
value  of  a  sum  of  guidance  error  and  control  effort  subject  to  the  EKF  procedures. 
Furthermore,  a  one-step-ahead  suboptimal  optimization  technique  has  been  developed 
and  implemented  to  avoid  iterative  computation.  The  approach  is  applied  to  vision-based 
target  tracking  and  obstacle  avoidance.  Simulation  results  verified  that  the  suggested 
guidance  law  significantly  improves  the  estimation  performance,  and  hence  improves  the 
overall guidance performance (Figure 15) [Watanabe et al. 2007].

Figure 15: (Left) Vehicle trajectories comparing suggested suboptimal guidance and
conventional  guidance  for  vision-based  target  tracking,  terminal  miss  distance  is 
significantly  reduced;  (Right)  Estimation  error  converges  to  zero  when  using  the 
suboptimal  guidance  (Conventional  green  dashed  lines,  Suboptimal  solid  blue) 


UKF  and  EMPF-based  Visual  Tracking  Systems:  We  have  developed  an  Unscented 
Kalman  Filter  (UKF)  approach  to  the  highly  nonlinear  vision-based  estimation  problem 
[Oh  and  Johnson  2007].  We  have  also  explored  particle  filtering  for  this  purpose.  While 
particle  filters  have  many  attractive  features  including  their  applicability  to  general 
nonlinear,  non-Gaussian  problems  without  approximations  of  noise  probability 
distributions,  they  also  suffer  from  some  defects.  The  most  serious  defect  might  be  the 
increasing computational cost in high-dimensional state-space models. One
technique  to  surmount  this  problem  without  reducing  the  efficiency  of  sampling 
techniques  is  to  reduce  the  dimension  of  the  state  space  model  by  marginalizing  out  some 
of  the  state  variable  components.  Since  the  vision-based  tracking  problem  can  only  be 
completely  described  by  a  relatively  high-dimensional  state-space  model,  direct 
employment  of  the  particle  filtering  on  this  problem  is  almost  impossible  because  an 
enormous  number  of  samples  are  required  to  properly  approximate  the  posterior 
distributions.  Hence,  the  idea  of  marginalization  (or  Rao-Blackwellization)  is  extended 
to  solve  this  problem  in  the  framework  of  an  extended  marginalized  particle  filter 
(EMPF). In this approach, the state components that are represented by
nonlinear dynamics with Gaussian process noise are
effectively marginalized out by employing the UKF to handle them.
The  idea  utilizes  the  reasoning  that  the  UKF  can  more  accurately  and  effectively  solve 
the  nonlinear  estimation  problems  with  Gaussian  noise  characteristics  compared  to  the 
EKF. Since vision sensor measurements are better represented by non-Gaussian
noise characteristics, and the vision information itself directly provides position
information only (velocity and acceleration information is obtained only indirectly,
over the progression of time), only the position state components with measurements of
vision information are solved in the particle filtering framework. The other state
components, represented by nonlinear equations with Gaussian noise, are handled by the
UKF (Figure 16). This approach can be easily extended to the design of a vision-based
tracking  system  that  incorporates  probabilistic  non-Gaussian  vision  information. 
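The claim that the UKF handles nonlinear Gaussian problems more accurately than the EKF can be illustrated with the unscented transform on a scalar example. This is a textbook sketch, not the flight code: it assumes a one-dimensional state, the basic sigma-point scheme with a single scaling parameter `kappa`, and a quadratic test function chosen because its true Gaussian moments are known in closed form.

```python
import math

def unscented_transform_1d(mean, var, f, kappa=2.0):
    """Propagate a scalar Gaussian through f using three sigma points."""
    n = 1
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa)] + [0.5 / (n + kappa)] * 2
    ys = [f(x) for x in points]
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var

def linearized_transform_1d(mean, var, f, df):
    """EKF-style propagation through a first-order Taylor expansion."""
    return f(mean), df(mean) ** 2 * var

def square(x):
    return x * x

def d_square(x):
    return 2.0 * x

# x ~ N(3, 4): the exact moments of x^2 are mean 13 and variance 176.
ukf_mean, ukf_var = unscented_transform_1d(3.0, 4.0, square)
ekf_mean, ekf_var = linearized_transform_1d(3.0, 4.0, square, d_square)
```

For this quadratic case the sigma-point result reproduces both exact moments, whereas the linearization misses the mean shift caused by curvature and underestimates the variance. Setting n + kappa = 3 is a commonly cited choice for Gaussian inputs.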




Figure  16:  Target  position  estimation  (left)  and  target  velocity  estimation  (right)  using 
the onboard image processing results obtained during flight testing on June 15, 2006.

Image  processing  results  are  post-processed  to  get  the  vision-based  relative  motion 
estimation  in  the  framework  of  the  EMPF.  GPS/INS  results  are  independently  recorded 
from  onboard  integrated  navigation  systems  during  the  flight  test  for  comparison. 

Adaptive  Estimation:  Our  previous  contributions  have  included  approaches  to  adaptive 
estimation, and this year the estimation design presented in [Sattigeri et al. 2006] has
been  validated  in  the  real-time  Georgia  Tech  Unmanned  Systems  Testbed  (GUST) 
simulation  software,  which  is  the  final  step  before  flight  testing.  In  GUST  the  adaptive 
estimation  design  is  integrated  with  image  processing,  guidance  and  control  algorithms, 
allowing vision-in-the-loop formation flight to be demonstrated in a
software-in-the-loop environment. Figure 17(a) shows the leader aircraft in the formation flight
simulation.  The  leader  aircraft  is  turning  in  a  circle  at  a  steady  rate  in  the  horizontal 
plane.  Figure  17(a)  is  a  screenshot  of  the  frame-grabber  window,  which  is  used  by  the 
image  processing  to  track  the  leader  aircraft.  The  image  processing  returns  the  location 
of  the  center  of  the  leader  aircraft  (green  crosshair)  and  the  wing-tips  (red  crosshairs) 
which  are  used  to  compute  the  LOS  and  subtended  angles  in  the  image  plane.  Figure 
17(b)  shows  the  leader  acceleration  estimation  performance  of  the  adaptive  neural 
network  (NN)  augmenting  the  nominal  estimator.  The  nominal  estimator  is  a  linear, 
time-varying  Kalman  filter  wherein  the  leader  acceleration  components  along  the  inertial 
axes  are  modeled  as  independent  zero-mean,  white  noise  processes.  The  NN  does  a  very 
good  job  of  estimating  the  unmodeled  leader  acceleration.  In  the  absence  of  adaptation, 
there  is  no  compensation  for  the  leader  acceleration  in  the  estimator  design.  This  causes 
the  leader  aircraft  to  drift  out  of  the  field-of-view  of  the  follower  vision  sensor  and 
ultimately  vision  formation  cannot  be  maintained  in  the  absence  of  adaptation  (not 
shown).  Flight  test  results  are  expected  in  the  near  future,  and  may  be  presented  at  the 
meeting. 

Figure 17: Formation flight with the adaptive estimator: (a) frame-grabber window showing
output of image processing, (b) leader acceleration estimation performance with
adaptive NN (ft/s^2)


Adaptive  Disturbance  Rejection  Controller  for  Visual  Tracking 

An  adaptive  disturbance  rejection  control  architecture  is  developed  in  [Stepanyan  and 
Hovakimyan  GNC  2005]  for  a  flying  vehicle  to  track  a  maneuvering  target  using  a 
monocular  camera  as  a  visual  sensor.  The  kinematic  equations  of  relative  motion  are 
formulated  in  the  body  frame  of  the  tracking  vehicle,  in  which  the  target  velocity  is 
viewed  as  a  time-varying  disturbance  that  is  assumed  to  be  in  the  form  of  a  constant  term 
plus  a  time-varying  term  with  bounded  integral  of  the  magnitude.  This  means  that  any 
maneuver  made  by  the  target  is  such  that  the  velocity  returns  to  some  constant  value  in 
finite  time  or  asymptotically  in  infinite  time  with  a  rate  sufficient  for  the  integral  of  the 
magnitude of velocity change to be finite. For example, any obstacle or collision avoidance
can  be  viewed  as  such  a  maneuver.  The  challenge  associated  with  the  unobservable 
relative  range  leads  to  a  reference  model,  dependent  upon  the  unknown  constant 
parameter associated with the target size. At the same time, the problem is complicated
by the presence of unknown time-varying disturbances associated with the
target's unknown velocity. Thus two challenges are addressed simultaneously: tracking of a
reference command that has an unknown parameter in it, and the disturbance rejection
problem for a multi-input multi-output system with positive but unknown high frequency
gain  in  each  control  channel.  The  proposed  guidance  law  uses  the  adaptive  synthesis 
approach developed in [19] for rejecting the time-varying disturbances and, as a result,
guarantees  asymptotic  tracking  of  estimated  reference  commands. 
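In symbols, the disturbance assumption described above can be written as follows. The names v_T (target velocity), v_0 (its constant part), and δ (its time-varying part) are introduced here only for illustration and do not appear in the source.

```latex
v_T(t) = v_0 + \delta(t), \qquad \int_0^{\infty} \lVert \delta(\tau) \rVert \, d\tau < \infty
```

For example, a velocity perturbation that decays like e^{-t} satisfies the integral condition, whereas one that decays like 1/(1+t) does not, even though both tend to zero.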

Tracking of the true reference commands requires identification of the target's size, which
in turn requires convergence of the parameter estimates to the true value. This has
been achieved by introducing the intelligent excitation technique [Cao and Hovakimyan
ACC 2005]. Following this method, a sinusoid with amplitude depending on the tracking
error  is  introduced  in  the  estimated  reference  command.  This  ensures  simultaneous 
parameter convergence and output regulation.

The limitations imposed on the target motion can be removed by considering the visual
tracking problem in an output feedback framework that involves the target's acceleration,
rather  than  the  velocity.  In  this  case,  the  acceleration  is  assumed  to  be  piece-wise 
continuous  and  bounded,  but  otherwise  with  unknown  bounds.  The  system  in  this 
general  case  is  of  vector  relative  degree,  has  in  general  time-varying  unknown  parameters 
associated  with  the  target's  geometry  and  bounded  unknown  disturbance  associated  with 
the  target's  acceleration.  The  reference  model  still  depends  on  the  unknown  parameters, 
and  perfect  tracking  can  be  achieved  if  the  parameter  estimates  converge  to  the  true 
values. 

To handle this problem, a robust adaptive observer design methodology was developed
[Stepanyan and Hovakimyan CDC 2005] for a class of uncertain nonlinear systems in the
presence of time-varying unknown parameters and non-vanishing disturbances. Using
the universal approximation property of radial basis function neural networks and the
adaptive bounding technique, the developed observer achieves asymptotic convergence
of the state estimation error to zero, while ensuring boundedness of the parameter errors.
However, the methodology requires the existence of an output injection matrix that makes
the linear part SPR-like. The latter condition is very challenging to ensure in visual
tracking and requires input-output filtering and state transformations, like those developed
in [20] and [21].

L1 Adaptive Estimation and Control:

Given the visual measurement of the target and the relative altitude (such as by
geo-referencing the image, captured by the onboard gimbaled camera, with a given database -
such as for a ground target), the estimation problem was formulated in such a way that
the recently developed L1 fast estimator can be applied to estimate the target's time-varying
velocity [Dobrokhodov et al. 2007]. Arbitrarily good estimation precision and
transient response can be obtained by increasing the bandwidth of the low-pass filter used
in the L1 fast estimator. The trade-off is that increasing the bandwidth requires a larger
adaptation rate and faster computation. The performance bound from disturbance/noise
in the measurements to the estimation error is systematically derived, and explicitly
accounts for out-of-frame events following the analysis on brief instabilities.
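
The bandwidth trade-off can be illustrated with a scalar toy problem. The sketch below is not the estimator of [Dobrokhodov et al. 2007]; it only assumes the generic structure of a state predictor, a fast adaptation law with a simple clipping projection, and a first-order low-pass filter on the adapted estimate. The plant (a relative position whose rate is the unknown velocity), all gains, and the test signal are hypothetical.

```python
import math


def simulate_fast_estimator(filter_bandwidth, t_end=10.0, dt=1e-3,
                            a=100.0, gamma=1.0e4, w_max=10.0):
    """Return the worst velocity-estimate error over the second half of the run.

    Toy plant: x_dot = w(t), with x measured and w(t) unknown.
    a      : predictor error-feedback gain (assumed)
    gamma  : adaptation rate (assumed); larger values need a smaller dt
    w_max  : bound used by the clipping projection (assumed)
    """
    x = x_hat = 0.0          # true and predicted state
    w_hat = w_est = 0.0      # raw adapted estimate and its low-pass filtered version
    worst = 0.0
    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        w = 2.0 + math.sin(0.5 * t)          # unknown time-varying velocity
        x_err = x_hat - x                    # prediction error
        x_hat += dt * (-a * x_err + w_hat)   # state predictor
        w_hat += dt * (-gamma * x_err)       # fast adaptation law
        w_hat = max(-w_max, min(w_max, w_hat))            # keep estimate bounded
        w_est += dt * filter_bandwidth * (w_hat - w_est)  # low-pass filter
        x += dt * w                          # true plant (forward Euler)
        if t >= 0.5 * t_end:
            worst = max(worst, abs(w_est - w))
    return worst
```

In this toy setting, raising `filter_bandwidth` reduces the lag of the filtered estimate. A high bandwidth is only useful if `gamma` is large enough for the raw estimate to be accurate at those frequencies, which in turn forces a smaller integration step.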

Closed-Loop  Image  Processing,  Guidance,  Navigation,  and  Control  Simulation 

After  achieving  initial  verification  of  the  individual  tracking  algorithms  on  specially 
tailored  simulations,  we  incorporated  the  components  into  a  real-time  simulation  of  two 
airplanes, including their guidance and control functions. The complete system, including
image processing, estimation, and guidance, has been implemented and tested in this
way.  A  scene  generator  is  used  as  the  input  to  image  processing  approaches.  A  typical 
result  is  shown  in  Figure  18. 

Figure 18: Closed-loop vision-based formation flight high-fidelity simulation results.
(Left) Relative position command, estimated, and actual for a change in relative position
command. (Right) Raw image with segmentation results overlay.


The adaptive estimation design described above for vision-based formation flight has
been validated in the same real-time simulation [Sattigeri, Johnson, Calise and Ha 2007].
Here,  the  adaptive  estimation  design  is  integrated  with  image  processing,  guidance  and 
control  algorithms,  allowing  vision-in-the-loop  formation  flight  to  be  demonstrated  in  a 
software-in-the-loop  environment.  Open-loop  results  with  synthetic  imagery  and 
recorded flight test data were obtained first. Figure 19 shows results obtained by
post-processing recorded flight test data, specifically leader velocity and position estimation
performance  with  the  adaptive  estimator. 


Figure 19: Leader Position Estimation Performance (ft), with Adaptive Estimation.
(Left) Velocity North, East, Down. (Right) Position North, East, Down. The leader
flies in a circle.


An  adaptive  integrated  guidance  and  control  design  developed  for  line-of-sight  formation 
flight  was  also  integrated  and  tested  in  the  simulation  [Sattigeri,  Johnson,  Calise  2008]. 

Adaptive Vision-Based Guidance Law with Guaranteed Performance Bounds for
Tracking a Ground Target with Time-Varying Velocity

This  work  extends  early  results  on  vision-based  tracking  of  a  ground  vehicle  moving  with 
unknown  time-varying  velocity.  The  follower  UAV  is  equipped  with  a  single  camera. 
The  control  objective  is  to  regulate  the  2D  horizontal  range  between  the  UAV  and  the 
target to a constant. Figure 20 shows a graphical illustration of the vision-based target
tracking scenario. Let $\rho(t)$ denote the 2D horizontal range between the UAV and the
target. The control objective is to regulate $\rho(t)$ to $\rho_d$, where $\rho_d$ is a given desired 2D
horizontal range between the UAV and the target. For simplicity, we consider the case
when $\rho_d$ is constant. For this system, the available measurements are visual
measurements of the target location within a 2D image, relative altitude by comparison to a
terrain database, and ownship state.
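
To make the role of these measurements concrete, the following minimal sketch recovers the 2D horizontal range from a single image measurement and the relative altitude. It assumes an ideal pinhole camera, a flat-earth North/East/Down (NED) frame, and a known camera-to-NED rotation matrix built from the ownship and gimbal state. The function name, argument names, and camera axis convention are hypothetical and are not taken from the cited work.

```python
import math


def horizontal_range(u, v, focal_px, cam_to_ned, rel_altitude):
    """2D horizontal range to a ground target from one pixel measurement.

    u, v         : pixel offsets of the target from the principal point
                   (camera x to the right, y down, z along the optical axis)
    focal_px     : focal length in pixels
    cam_to_ned   : 3x3 rotation matrix (nested lists) from camera to NED axes
    rel_altitude : height of the camera above the target (> 0)
    """
    los_cam = (u, v, focal_px)  # line-of-sight direction in camera axes
    north, east, down = (
        sum(cam_to_ned[i][j] * los_cam[j] for j in range(3)) for i in range(3)
    )
    if down <= 0.0:
        raise ValueError("line of sight does not intersect the ground")
    # Scale the line of sight so that its down component equals the altitude.
    scale = rel_altitude / down
    return scale * math.hypot(north, east)
```

For a camera pointing straight down, a target at the image center gives zero range, and a target offset by one focal length gives a range equal to the relative altitude (a 45 degree line of sight).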


Figure  20:  Relative  kinematics  of  UAV-target  motion. 


The extension has two distinct features [Ma et al. GNC 2008]. An earlier developed
guidance  law  used  the  estimates  of  the  target's  velocity  obtained  from  a  fast  estimation 
scheme.  We  explicitly  derive  the  tracking  performance  bound  as  a  function  of  the 
estimation  error.  The  performance  bounds  imply  that  the  signals  of  the  closed-loop 
adaptive  system  remain  close  to  the  corresponding  signals  of  a  bounded  closed-loop 
reference  system  both  in  transient  and  steady-state.  The  reference  system  is  introduced 
solely  for  the  purpose  of  analysis.  This  work  also  analyzes  the  stability  and  the 
performance degradation of the closed-loop adaptive system in the presence of
out-of-frame events, when continuous extraction of the target's information is not feasible due to
failures  in  the  image  processing  module.  The  feedback  loop  is  then  closed  using  the 
frozen  estimates.  The  out-of-frame  events  are  modeled  as  brief  instabilities.  A  sufficient 
condition  for  the  switching  signal  is  derived  that  guarantees  graceful  degradation  of 
performance  during  target  loss.  The  results  build  upon  the  earlier  developed  fast 
estimation  scheme  of  the  target's  velocity,  the  inverse-kinematics-based  guidance  law  and 
insights  from  switching  systems  theory. 
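
The sufficient condition itself is not reproduced in this report. As a hedged illustration of the kind of reasoning involved, the sketch below uses a scalar toy model in which an error measure decays at a rate `decay_rate` while the target is in frame and grows at a rate `growth_rate` while it is lost. For that toy model, the error shrinks over a horizon whenever the fraction of time spent out of frame is below `decay_rate / (decay_rate + growth_rate)`. The function names and rates are hypothetical, and the loss intervals are assumed not to overlap.

```python
import math


def out_of_frame_fraction(loss_intervals, horizon):
    """Fraction of [0, horizon] during which the target is out of frame."""
    lost = sum(
        min(end, horizon) - max(start, 0.0)
        for start, end in loss_intervals
        if end > 0.0 and start < horizon
    )
    return lost / horizon


def max_tolerable_fraction(decay_rate, growth_rate):
    """Largest out-of-frame fraction for which the toy error still decays."""
    return decay_rate / (decay_rate + growth_rate)


def error_growth_factor(loss_intervals, horizon, decay_rate, growth_rate):
    """Ratio of final to initial error for the scalar toy model."""
    t_lost = out_of_frame_fraction(loss_intervals, horizon) * horizon
    return math.exp(growth_rate * t_lost - decay_rate * (horizon - t_lost))
```

For example, with losses during (2, 3) and (6, 7.5) over a 10 s horizon, the out-of-frame fraction is 0.25. With a decay rate of 1 and a growth rate of 2, the tolerable fraction is 1/3, so the toy error still shrinks overall.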

Flight Testing and Sensor Development

Flight  Testing 

A helicopter UAV was developed and tested with automated capabilities that include
searching a prescribed area, identifying a specific building within that area based on a
small sign located on one wall, and then identifying an opening into that building. Results
include successful evaluation at the McKenna Military Operations in Urban Terrain flight
test site. Active contours were used to locate openings (Figure 21). In a separate but related
activity, contours such as those in the figure were successfully used to update an
inertial navigation solution, allowing the vehicle to operate without GPS or other aiding
for extended periods [Proctor et al. 2003].


Figure  21:  Flight  test  results  segmenting  openings  into  a  building 

A glider, which is capable of flying from a starting point to a pre-defined ending location
using only a single vision sensor, has been flight tested [Proctor et al. 2006]. The
estimator uses an extended Kalman filter (EKF). The algorithms were tested with a glider
instrumented only with a single camera.

We  validated  the  vision-based  segmentation,  estimation,  guidance,  and  control  strategy 
developed  previously  in  flight  test.  On  June  15,  2006,  the  project  had  a  particularly 
significant  highlight.  One  of  our  research  aircraft  held  formation  for  an  extended  period 
with  another  aircraft,  utilizing  a  vision  sensor  as  its  only  indication  of  the  state  of  the 
other  aircraft.  The  Leader  aircraft  was  a  1/3  scale  Edge  540T  with  a  GPS/INS  based 
autopilot,  flying  in  a  large  circular  pattern  over  our  test  range  at  slow  speed.  The 
Follower  aircraft  was  the  GTMax  (based  on  the  Yamaha  RMAX)  research  helicopter, 
utilizing  onboard  image  processing,  lead  aircraft  state  estimation,  guidance,  and  control. 
On  engagement,  the  follower  held  formation  for  approximately  two  full  "orbits"  of  the 
test  range  in  a  shallow  turn  -  encountering  a  variety  of  lighting  and  wind/gust  conditions. 
This  may  have  been  the  first  time  automated  formation  flight  based  on  vision  has  been 
done.  Segmentation  and  estimation  data  are  shown  below  in  Figures  22  and  23. 

flight). Lead aircraft center and wing tip positions are found in real time (graphically
shown with a “+”), and this data is utilized to estimate the position, velocity, acceleration,
and size of the Leader. The estimated position, velocity, and acceleration are utilized to
fly formation.

Figure 23: Position estimation (left) and position estimation error (right) during one of
the flights on June 15, 2006, comparing onboard vision-based estimate of Leader location
with Leader’s reported location (GPS/INS solution). For part of two cycles of the
circular motion, the Follower is utilizing the vision-based estimate to maintain formation
and ignoring the reported GPS/INS solution. (Coordinates are North/East/Down in ft;
IP = vision based, GPS = GPS/INS solution.)


This  test  result  involved  some  of  the  simplest  of  the  approaches  developed  under  this 
project.  In  the  final  two  years  of  the  project,  we  anticipate  greatly  enhanced  performance 
as  we  incorporate  these  more  advanced  methods. 

Subsequently,  we  continued  to  validate  the  vision-based  segmentation,  estimation, 
guidance,  and  control  strategies  developed  previously  in  flight  testing.  Another 
significant activity was bringing on-line a second airplane, which we call the GTYak, a
33% scale Yak aerobatic airplane (Figure 24). This has enabled us to switch to
two-airplane tests with two airplanes with similar performance capabilities. The first
formation flight with this aircraft was in February 2007. The first closed-loop
vision-based tests were performed in July 2007.

Figure 24: (right) GTYak vision-based tracking testbed. Camera is located in pod on right
wing. Image processing and estimation computer is in canopy area. (left) GTYak flying in
formation with GTEdge “target”/“leader” airplane reported on previously.

Sensor  Design 

Another objective of this effort is the development of innovative optical imaging systems
for 3D imaging of ground and airborne targets from autonomous vehicles. Our approach
is to use a new type of optical element, called a “volume holographic lens” (VHL),
which performs “optical slicing” on reflective objects (such as tanks and trucks).
Optical slicing means that, when the instrument is focused on a certain plane,
only the portion of the target that intersects the same plane is visible; the remainder is
dark. By combining several focal planes, the entire 3D target shape is reconstructed. The
benefits of this approach are that it requires neither multiple views nor structured
illumination, and it is not subject to ambiguities such as the “correspondence problem” in
computer vision. On the other hand, the VHL method forms images “one line at a time,”
and so it requires 2D scanning to recover the 3D target in its entirety. The flight path of
the autonomous vehicle itself can be used to implement the required scanning.
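
The step of combining several focal planes can be sketched as follows, under the simplifying assumption that each optical slice is available as a 2D brightness array in which a pixel is bright only when the target surface intersects that focal plane. The function name, data layout, and brightness threshold are hypothetical; the actual VHL processing chain is not described at this level in the report.

```python
def depth_from_slices(slices, focal_depths, threshold=0.1):
    """Combine optical slices into a depth map.

    slices       : list of 2D brightness arrays (nested lists), one per focal plane
    focal_depths : depth of each focal plane, same order as slices
    threshold    : minimum brightness for a pixel to count as "on the surface"
    Returns a 2D list of depths, with None where no slice is bright enough.
    """
    rows, cols = len(slices[0]), len(slices[0][0])
    depth = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Pick the focal plane in which this pixel is brightest.
            k_best = max(range(len(slices)), key=lambda k: slices[k][r][c])
            if slices[k_best][r][c] >= threshold:
                depth[r][c] = focal_depths[k_best]
    return depth
```

For instance, with two one-row slices `[[0.9, 0.0, 0.0]]` and `[[0.0, 0.8, 0.05]]` at depths 1.0 and 2.0, the first pixel is assigned depth 1.0, the second depth 2.0, and the third is left empty because it is dark in both slices.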

During  the  first  year  we  developed  two  significant  improvements  on  the  operation  of 
VHLs,  namely  (i)  a  super-resolution  method,  based  on  the  Viterbi  algorithm,  which 
improves  the  depth  resolution  of  the  VHL  by  a  factor  of  5,  and  thus  permits  the 
instrument  to  see  features  of  the  target  which  are  finer  than  diffraction  theory  would  have 
predicted; and (ii) a method to reduce scanning from 2D to 1D, based on the dispersion
properties  of  VHLs,  which  permits  the  instrument  to  acquire  3D  images  much  faster  than 
we  had  planned  for  originally. 

Second  year  accomplishments  were  focused  on  a  new  use  of  Viterbi  which  combines  the 
increased  resolution  and  denoising  properties  with  reduction  in  scanning  time;  and  a  new 
method  for  acquiring  hyper-spectral  images  (spatial  as  well  as  “true”  -  non-RGB  -  color 
information)  with  passive  (sunlight)  illumination. 

We  have  previously  demonstrated  the  use  of  the  Viterbi  algorithm  for  improving  the 
quality of volume holographic images; namely, reducing the effects of noise in
post-processing when the desired depth resolution is finer than the instrument’s classical
resolution. However, in the version implemented in our prior research the required
number of scanned depths was equal to the number of desired reconstruction depths. We
have  succeeded  in  implementing  a  new  version  where  the  number  of  scan  depths  can  be 
smaller  than  the  number  of  reconstructed  depths  by  a  factor  of  approximately  4  (even 
better  is  achievable  assuming  low  noise  conditions).  Thus,  the  Viterbi  algorithm  can  be 
thought  of  as  performing  a  kind  of  interpolation  in  this  case.  Further  reduction  in 
scanning  is  achieved  using  the  multiplexing  technique  for  volume  holograms;  namely, 
one  combines  two  or  more  volume  holographic  lenses  in  a  single  optical  element,  so  that 
the  lenses  image  simultaneously  separate  depths  in  the  target  space.  The  images  are 
separated  easily  because  the  multiplexed  holograms  are  capable  of  directing  each  image 
onto a separate area of the digital camera (or to separate cameras). The goal of the
experiment, experimental arrangement, and typical results are summarized in Figure 25.
This research was carried out in collaboration with Prof. Mark A. Neifeld of the
University of Arizona.
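
The measurement model used with the Viterbi algorithm in this work is not given in the report. The sketch below shows only the generic dynamic-programming idea on a toy problem: recovering a surface profile over a discrete set of depth levels from noisy depth readings, with a penalty on jumps between neighboring columns. The cost terms, penalty value, and function name are assumptions made for illustration.

```python
def viterbi_depth_profile(measurements, levels, jump_penalty=1.0):
    """Most likely sequence of depth levels for a noisy 1D depth profile.

    Cost of a path = sum of squared measurement errors
                     + jump_penalty * (number of levels jumped between columns).
    """
    m = len(levels)
    # Cost of ending in each level after the first column.
    cost = [(measurements[0] - lv) ** 2 for lv in levels]
    back = []  # back[i][j] = best predecessor of level j at column i + 1
    for z in measurements[1:]:
        new_cost, ptr = [], []
        for j in range(m):
            k_best = min(range(m), key=lambda k: cost[k] + jump_penalty * abs(j - k))
            transition = cost[k_best] + jump_penalty * abs(j - k_best)
            new_cost.append(transition + (z - levels[j]) ** 2)
            ptr.append(k_best)
        cost = new_cost
        back.append(ptr)
    # Trace the cheapest path backwards from the last column.
    j = min(range(m), key=lambda k: cost[k])
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return [levels[j] for j in path]
```

With a large jump penalty, an isolated outlier is smoothed away, while a genuine step in the surface survives when the penalty is moderate. The same trellis structure is what allows neighboring measurements to inform depths that were not scanned directly.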


Figure 25: (a) Scanning profilometry with Viterbi interpolation. The thick black line
denotes the cross-section of the (reflective) target surface; normally, one would need to
scan so as to acquire a separate image for each column of voxels. Yet using the Viterbi
algorithm it is possible to acquire only two images, at the voxel columns denoted with
dotted red lines, and reconstruct the rest of the target based on these two depth scans
only. (b) Experimental arrangement: slices #1 and #2 correspond to the red dotted lines of
Figure 25(a), and are being imaged by the volume holographic multiplex lens onto
separate positions on the camera, as denoted by the green arrows; thus the two required
images are acquired in a single step in this experiment. (c) Reconstruction of a LEGO®
object using the experimental setup of Figure 25(b). 8 depth levels were
reconstructed in one shot in this experiment. The depth resolution was 1.6 mm and the
working distance was 0.5 m.


We  originally  reported  a  novel  “rainbow”  volume  holographic  imaging  technique,  where 
the  target  is  illuminated  by  a  rainbow,  which  can  be  thought  of  as  a  multitude  of  colored 
slits  imaged  on  the  target.  The  volume  holographic  imager  is  capable  of  performing 
optical  slicing,  i.e.  acquiring  slice-wise  depth  information  from  each  slit  of  different  color 
in  parallel.  Subsequent  to  this,  we  succeeded  in  improving  this  technique  by  a 
modification  which  allows  us  to  (a)  utilize  passive  illumination,  e.g.  sunlight,  and  still 
recover  slice-wise  depth  information;  and  (b)  estimate  the  color  composition  of  the  target 
as  well  as  its  spatial  shape  in  the  same  step  (whereas  the  original  rainbow  technique 
required  a  color  scanning  step  for  non-white  targets).  Due  to  its  passive  nature,  we  refer 
to the new technique as “Sun Light” volume holographic imaging. The experimental
arrangement of the Sun Light method, and typical experimental results obtained with a
home-made  reflective  target,  are  summarized  in  Figure  26. 


Figure 26: The Sun Light volume holographic imaging method. (a) Experimental
arrangement showing (for simplicity) a flat, tilted, reflective target at the input plane. The
white-light (Sun Light) illumination is first spectrally analyzed by an auxiliary grating
and then forwarded to the volume holographic lens, which completes one rotation to fill
out the lateral field of view, as shown. (b) Experimental results, clockwise from bottom
left: reconstruction of the tilted target profile, indicating the accuracy of the
measurement (better than 20 µm); and depth-selective images of three (arbitrarily chosen)
spectral components: cyan, light green, and orange-red.

AFRL Point of Contact and Interactions


Primary AFRL point of contact was Johnny Evers, AFRL/MNGN, Eglin AFB, FL,
850-882-2961 x2347.

We  have  had  technical  interactions  with  AFRL/MNGN  and  AFRL/VACA  regarding 
flight  tests  for  unmanned  aerial  vehicles  with  vision-based  guidance  policies.  One  area  of 
interaction  involved  improving  the  nominal  autopilot  design  by  making  it  adaptive. 
Calise, Tannenbaum, Hovakimyan, Betser, and Vela had extended visits to support
related activities. Weekly teleconferences were held August-December 2004, and reports
were written related to adaptive autopilot design, adaptive guidance and integration of an
IMU with GPS measurements. An adaptive autopilot was auto-coded from Simulink and
integrated at Eglin by Ali Kutay, a GRA working under the direction of Prof. Calise.
Stephen  Card,  a  Georgia  Tech  student  who  spends  summers  working  at  AFRL/MNGN  on 
these  activities,  also  worked  on  these  projects  during  the  academic  year  while  at  Georgia 
Tech. 

We  collaborated  with  Dr.  J.V.R.  Prasad  at  Georgia  Tech  and  Sikorsky  on  a  project  to 
implement  adaptive  guidance  laws  for  unmanned  helicopter  formation  flight  with  ground 
and  aerial  targets.  The  adaptive  guidance  laws  were  flight  tested  using  the  Georgia  Tech 
rotorcraft  UAV  GTMAX  while  maintaining  range  from  a  ground  target.  The  designs  and 
software  were  transitioned  to  Sikorsky  for  integration  into  a  high  fidelity  simulator. 

A number of technical conferences were attended by members of the team, several also
by AFRL, including CDC, ACC, GNC, ECCV, CLEO/IQEC, SPIE, and others. Special
sessions relating to this project took place at the 2004 CDC and the 2005 ACC.
One is planned for the 2005 GNC.